智能论文笔记

Domain Engineering for Applied Monocular Reconstruction of Parametric Faces

Igor Borovikov , Karine Levonyan , Jon Rein , Pawel Wrotek , Nitish Victor

分类：计算机视觉

2022-09-06

许多现代的在线3D应用程序和视频游戏都依靠人脸的参数模型来创建可信的化身。但是，用参数模型手动复制某人的面部相似性是困难且耗时的。该任务的机器学习解决方案是非常可取的，但也充满挑战。本文提出了一种新的方法来解决所谓的面对参数问题（简称F2P），旨在重建单个图像的参数面。所提出的方法利用合成数据，域分解和域适应来解决解决F2P的多方面挑战。开源代码库说明了我们的主要观察结果，并提供了定量评估的手段。提出的方法在工业应用中证明是实际的。它提高了准确性并允许更有效的模型培训。这些技术有可能扩展到其他类型的参数模型。

translated by 谷歌翻译

Applied monocular reconstruction of parametric faces with domain engineering

Igor Borovikov , Karine Levonyan , Jon Rein , Pawel Wrotek , Nitish Victor

分类：计算机视觉

2022-08-05

许多现代的在线3D应用程序和视频游戏依靠人面孔的参数模型来创建可信的化身。但是，使用参数模型对某人的面部相似性进行手动复制是困难且耗时的。该任务的机器学习解决方案是非常可取的，但也充满挑战。本文提出了一种新的方法来解决所谓的面对参数问题（简称F2P），旨在重建单个图像的参数面。所提出的方法利用合成数据，域分解和域适应来解决解决F2P的多方面挑战。开源代码库说明了我们的主要观察结果，并提供了定量评估的手段。提出的方法在工业应用中证明是实际的。它提高了准确性并允许更有效的模型培训。这些技术有可能扩展到其他类型的参数模型。

translated by 谷歌翻译

Automatic Text Simplification of News Articles in the Context of Public Broadcasting

Diego Maupomé , Fanny Rancourt , Thomas Soulas , Alexandre Lachance , Marie-Jean Meurs , Desislava Aleksandrova , Olivier Brochu Dufour , Igor Pontes , Rémi Cardon , Michel Simard

分类：自然语言处理 | 人工智能 | 机器学习

2022-12-26

This report summarizes the work carried out by the authors during the Twelfth Montreal Industrial Problem Solving Workshop, held at Universit\'e de Montr\'eal in August 2022. The team tackled a problem submitted by CBC/Radio-Canada on the theme of Automatic Text Simplification (ATS).

translated by 谷歌翻译

Ensemble learning techniques for intrusion detection system in the context of cybersecurity

Andricson Abeline Moreira , Carlos A. C. Tojeiro , Carlos J. Reis , Gustavo Henrique Massaro , Igor Andrade Brito e Kelton A. P. da Costa

分类：机器学习

2022-12-21

Recently, there has been an interest in improving the resources available in Intrusion Detection System (IDS) techniques. In this sense, several studies related to cybersecurity show that the environment invasions and information kidnapping are increasingly recurrent and complex. The criticality of the business involving operations in an environment using computing resources does not allow the vulnerability of the information. Cybersecurity has taken on a dimension within the universe of indispensable technology in corporations, and the prevention of risks of invasions into the environment is dealt with daily by Security teams. Thus, the main objective of the study was to investigate the Ensemble Learning technique using the Stacking method, supported by the Support Vector Machine (SVM) and k-Nearest Neighbour (kNN) algorithms aiming at an optimization of the results for DDoS attack detection. For this, the Intrusion Detection System concept was used with the application of the Data Mining and Machine Learning Orange tool to obtain better results

translated by 谷歌翻译

A scalable framework for annotating photovoltaic cell defects in electroluminescence images

Urtzi Otamendi , Inigo Martinez , Igor G. Olaizola , Marco Quartulli

分类：计算机视觉 | 人工智能 | 机器学习

2022-12-15

The correct functioning of photovoltaic (PV) cells is critical to ensuring the optimal performance of a solar plant. Anomaly detection techniques for PV cells can result in significant cost savings in operation and maintenance (O&M). Recent research has focused on deep learning techniques for automatically detecting anomalies in Electroluminescence (EL) images. Automated anomaly annotations can improve current O&M methodologies and help develop decision-making systems to extend the life-cycle of the PV cells and predict failures. This paper addresses the lack of anomaly segmentation annotations in the literature by proposing a combination of state-of-the-art data-driven techniques to create a Golden Standard benchmark. The proposed method stands out for (1) its adaptability to new PV cell types, (2) cost-efficient fine-tuning, and (3) leverage public datasets to generate advanced annotations. The methodology has been validated in the annotation of a widely used dataset, obtaining a reduction of the annotation cost by 60%.

translated by 谷歌翻译

RT-1: Robotics Transformer for Real-World Control at Scale

Anthony Brohan , Noah Brown , Justice Carbajal , Yevgen Chebotar , Joseph Dabis , Chelsea Finn , Keerthana Gopalakrishnan , Karol Hausman , Alex Herzog , Jasmine Hsu

分类：机器人 | 人工智能 | 自然语言处理 | 计算机视觉 | 机器学习

2022-12-13

By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. While this capability has been demonstrated in other fields such as computer vision, natural language processing or speech recognition, it remains to be shown in robotics, where the generalization capabilities of the models are particularly critical due to the difficulty of collecting real-world robotic data. We argue that one of the keys to the success of such general robotic models lies with open-ended task-agnostic training, combined with high-capacity architectures that can absorb all of the diverse, robotic data. In this paper, we present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties. We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks. The project's website and videos can be found at robotics-transformer.github.io

translated by 谷歌翻译

WiSwarm: Age-of-Information-based Wireless Networking for Collaborative Teams of UAVs

Vishrant Tripathi , Igor Kadota , Ezra Tal , Muhammad Shahir Rahman , Alexander Warren , Sertac Karaman , Eytan Modiano

分类：机器人

2022-12-06

The Age-of-Information (AoI) metric has been widely studied in the theoretical communication networks and queuing systems literature. However, experimental evaluation of its applicability to complex real-world time-sensitive systems is largely lacking. In this work, we develop, implement, and evaluate an AoI-based application layer middleware that enables the customization of WiFi networks to the needs of time-sensitive applications. By controlling the storage and flow of information in the underlying WiFi network, our middleware can: (i) prevent packet collisions; (ii) discard stale packets that are no longer useful; and (iii) dynamically prioritize the transmission of the most relevant information. To demonstrate the benefits of our middleware, we implement a mobility tracking application using a swarm of UAVs communicating with a central controller via WiFi. Our experimental results show that, when compared to WiFi-UDP/WiFi-TCP, the middleware can improve information freshness by a factor of 109x/48x and tracking accuracy by a factor of 4x/6x, respectively. Most importantly, our results also show that the performance gains of our approach increase as the system scales and/or the traffic load increases.

translated by 谷歌翻译

SparsePose: Sparse-View Camera Pose Regression and Refinement

Samarth Sinha , Jason Y. Zhang , Andrea Tagliasacchi , Igor Gilitschenski , David B. Lindell

分类：计算机视觉

2022-11-29

Camera pose estimation is a key step in standard 3D reconstruction pipelines that operate on a dense set of images of a single object or scene. However, methods for pose estimation often fail when only a few images are available because they rely on the ability to robustly identify and match visual features between image pairs. While these methods can work robustly with dense camera views, capturing a large set of images can be time-consuming or impractical. We propose SparsePose for recovering accurate camera poses given a sparse set of wide-baseline images (fewer than 10). The method learns to regress initial camera poses and then iteratively refine them after training on a large-scale dataset of objects (Co3D: Common Objects in 3D). SparsePose significantly outperforms conventional and learning-based baselines in recovering accurate camera rotations and translations. We also demonstrate our pipeline for high-fidelity 3D reconstruction using only 5-9 images of an object.

translated by 谷歌翻译

Melting Pot 2.0

John P. Agapiou , Alexander Sasha Vezhnevets , Edgar A. Duéñez-Guzmán , Jayd Matyas , Yiran Mao , Peter Sunehag , Raphael Köster , Udari Madhushani , Kavya Kopparapu , Ramona Comanescu

分类：人工智能 | 神经与进化计算

2022-11-24

Multi-agent artificial intelligence research promises a path to develop intelligent technologies that are more human-like and more human-compatible than those produced by "solipsistic" approaches, which do not consider interactions between agents. Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios. Each scenario pairs a physical environment (a "substrate") with a reference set of co-players (a "background population"), to create a social situation with substantial interdependence between the individuals involved. For instance, some scenarios were inspired by institutional-economics-based accounts of natural resource management and public-good-provision dilemmas. Others were inspired by considerations from evolutionary biology, game theory, and artificial life. Melting Pot aims to cover a maximally diverse set of interdependencies and incentives. It includes the commonly-studied extreme cases of perfectly-competitive (zero-sum) motivations and perfectly-cooperative (shared-reward) motivations, but does not stop with them. As in real-life, a clear majority of scenarios in Melting Pot have mixed incentives. They are neither purely competitive nor purely cooperative and thus demand successful agents be able to navigate the resulting ambiguity. Here we describe Melting Pot 2.0, which revises and expands on Melting Pot. We also introduce support for scenarios with asymmetric roles, and explain how to integrate them into the evaluation protocol. This report also contains: (1) details of all substrates and scenarios; (2) a complete description of all baseline algorithms and results. Our intention is for it to serve as a reference for researchers using Melting Pot 2.0.

translated by 谷歌翻译

Multi-Environment Pretraining Enables Transfer to Action Limited Datasets

David Venuto , Sherry Yang , Pieter Abbeel , Doina Precup , Igor Mordatch , Ofir Nachum

分类：机器学习

2022-11-23

Using massive datasets to train large-scale models has emerged as a dominant approach for broad generalization in natural language and vision applications. In reinforcement learning, however, a key challenge is that available data of sequential decision making is often not annotated with actions - for example, videos of game-play are much more available than sequences of frames paired with their logged game controls. We propose to circumvent this challenge by combining large but sparsely-annotated datasets from a \emph{target} environment of interest with fully-annotated datasets from various other \emph{source} environments. Our method, Action Limited PreTraining (ALPT), leverages the generalization capabilities of inverse dynamics modelling (IDM) to label missing action data in the target environment. We show that utilizing even one additional environment dataset of labelled data during IDM pretraining gives rise to substantial improvements in generating action labels for unannotated sequences. We evaluate our method on benchmark game-playing environments and show that we can significantly improve game performance and generalization capability compared to other approaches, using annotated datasets equivalent to only $12$ minutes of gameplay. Highlighting the power of IDM, we show that these benefits remain even when target and source environments share no common actions.

translated by 谷歌翻译